One class random forests
Authors
Abstract
One-class classification is a binary classification task for which only one class of samples is available for learning. In preliminary work, we proposed One Class Random Forests (OCRF), a method based on a random forest algorithm and an original outlier generation procedure that makes use of classifier ensemble randomization principles. In this paper, we present an extensive study of the behavior of OCRF, including experiments on various public UCI datasets and a comparison, with an assessment of statistical significance, to reference one-class algorithms: Gaussian density models, Parzen estimators, Gaussian mixture models and One Class SVMs. Our aim is to show that the randomization principles embedded in a random forest algorithm make the outlier generation process more efficient and, in particular, make it possible to break the curse of dimensionality. One Class Random Forests are shown to perform well in comparison to the other methods and, in particular, to maintain stable performance in higher dimensions, where the other algorithms may fail.
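The curse of dimensionality mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the OCRF outlier generation procedure. It assumes, purely for illustration, that the target class fills the unit ball and that artificial outliers are drawn uniformly from the enclosing hypercube [-1, 1]^d. The function name `ball_to_cube_volume_ratio` is hypothetical. Under these assumptions, the fraction of the sampling volume occupied by the target region collapses as d grows, so a naive uniform sampler in the full feature space needs a rapidly growing number of outliers to cover the space around the target data.

```python
import math


def ball_to_cube_volume_ratio(d: int) -> float:
    """Volume of the unit d-ball divided by the volume of the cube [-1, 1]^d.

    Illustrative assumption: the target class fills the unit ball and
    outliers are sampled uniformly from the enclosing cube.
    """
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    cube_volume = 2.0 ** d
    return ball_volume / cube_volume


if __name__ == "__main__":
    # Print how quickly the target region becomes a negligible part
    # of the sampling volume as the dimension increases.
    for d in (1, 2, 5, 10, 20):
        print(f"d={d:2d}  ratio={ball_to_cube_volume_ratio(d):.3e}")
```

One way to limit this effect, in line with the randomization principles the abstract refers to, is to work in lower-dimensional random subsets of the features, where the ratio above stays far from zero.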
Similar resources
One Class Splitting Criteria for Random Forests
Random Forests (RFs) are strong machine learning tools for classification and regression. However, they remain supervised algorithms, and no extension of RFs to the one-class setting has been proposed, except for techniques based on second-class sampling. This work fills this gap by proposing a natural methodology to extend standard splitting criteria to the one-class setting, structurally gene...
Random Forests for Big Data
Big Data is one of the major challenges of statistical science and has numerous consequences from algorithmic and theoretical viewpoints. Big Data always involve massive data but they also often include data streams and data heterogeneity. Recently some statistical methods have been adapted to process Big Data, like linear regression models, clustering methods and bootstrapping schemes. Based o...
An Introduction to Random Forests for Multi-class Object Detection
Object detection in large-scale real-world scenes requires efficient multi-class detection approaches. Random forests have been shown to handle large training datasets and many classes for object detection efficiently. The most prominent example is the commercial application of random forests for gaming [37]. In this paper, we describe the general framework of random forests for multi-class obj...
Customer churn prediction using improved balanced random forests
Churn prediction is becoming a major focus of banks in China who wish to retain customers by satisfying their needs under resource constraints. In churn prediction, an important yet challenging problem is the imbalance in the data distribution. In this paper, we propose a novel learning method, called improved balanced random forests (IBRF), and demonstrate its application to churn prediction. ...
Fault Locating in High Voltage Transmission Lines Based on Harmonic Components of One-end Voltage Using Random Forests
In this paper, an approach is proposed for accurate locating of single phase faults in transmission lines using voltage signals measured at one-end. In this method, harmonic components of the voltage signals are extracted through Discrete Fourier Transform (DFT) and are normalized by a transformation. The proposed fault locator, which is designed based on Random Forests (RF) algorithm, is train...
Journal title:
- Pattern Recognition
Volume 46, Issue
Pages -
Publication date 2013